Introduction

Objectives

This report describes the four ACT-R models and the learning outcomes produced by changes in their parameters. It also describes how these models fit the behavioral data and details the properties of the best-fitting models and parameters. The specific objectives of this project are to test whether the RLWM task can be modeled well by a group of pure and combined declarative-learning models. After fitting the models to participant data, we aim to extract parameters that may explain why and how learning proceeded as observed. If the parameters describe individual differences in learning, would they also predict other behavioral measures, such as working memory capacity?

ACT-R Models

Below are the four ACT-R models tested. Note that the bolded names appear throughout this document.

  • RL: A pure RL model based on learning of production utility in ACT-R. Learning rate (alpha) and softmax temperature are its only two parameters.

  • LTM: A declarative model that depends solely on storage and retrieval of stimulus, response, and outcome in ACT-R’s declarative memory. This model depends on decay rate and retrieval noise, among other parameters.

  • meta_RL: A combined RL-LTM model. Information about trials performed by the RL system is shared with and stored in LTM (declarative memory) for later use. An isolated (meta) RL system (a set of productions) learns and determines which subsystem, RL or LTM, is used throughout learning. Which subsystem is preferred depends on the specific parameter set.

  • biased: A combined RL-LTM model. Information about trials performed by the RL system is not shared with the LTM subsystem. An additional “strategy” parameter specifies a bias toward the RL model at 20, 40, 60, and 80 percent of the learning and test trials.
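For concreteness, the RL model's two parameters can be illustrated with ACT-R's utility-learning equation (U ← U + α(R − U)) and a Boltzmann softmax choice rule. This is a minimal sketch with hypothetical function names, not the actual model code:

```python
import math
import random

def update_utility(u, reward, alpha):
    """ACT-R utility learning: U <- U + alpha * (R - U)."""
    return u + alpha * (reward - u)

def softmax_choice(utilities, temperature):
    """Pick a production index by Boltzmann softmax over utilities."""
    exps = [math.exp(u / temperature) for u in utilities]
    total = sum(exps)
    r = random.random()
    cum = 0.0
    for i, e in enumerate(exps):
        cum += e / total
        if r < cum:
            return i
    return len(exps) - 1  # guard against floating-point round-off
```

Higher temperatures flatten the choice probabilities (more exploration); a larger alpha moves utilities toward recent rewards more aggressively.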

Approach

Talk about BIC and the model fitting
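One conventional definition, assuming the standard form BIC = k·ln(n) − 2·ln(L̂) with lower values indicating a better fit (a sketch; the project's actual fitting pipeline may differ):

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k*ln(n) - 2*ln(L-hat).
    Lower values indicate a better fit after penalizing extra parameters."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood
```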

Results

The LTM model best fits the largest number of participants (54), followed by the biased version of the combined RL-LTM model (18) and the meta-RL combined model in third (10). Only one participant is best fit by the pure RL model.

Figure 1.


There is only one best-fitting RL parameter set. For the most popular model, LTM, which best fit 54 participants, there are only 13 best-fitting parameter sets. The biased model appears the most diverse, with 17 parameter sets across its 18 participants. The meta-RL model closely follows the biased model in diversity of parameter sets, with 8 parameter sets for its 10 subjects.

BIC value descriptions

The following two boxplots (Figures 2 and 3) show the medians and ranges of the BIC values that determined that the LTM model is the best-fitting model. Figure 2 shows the range in BIC across all participants, whether or not they fit that model best, and therefore has the same number of data points (83) for every model. Figure 3, however, displays BIC medians and ranges only for the participants best fit by that model (which is why the RL plot is a line representing its single data point).
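The per-participant model selection described here (lowest BIC wins) can be sketched as follows; the table structure is a hypothetical illustration, not the project's actual data format:

```python
def best_model_per_participant(bic_table):
    """bic_table maps participant -> {model_name: BIC value}.
    Returns participant -> name of the model with the lowest BIC."""
    return {p: min(scores, key=scores.get) for p, scores in bic_table.items()}

# hypothetical BIC values for two participants
table = {6200: {"RL": 410.2, "LTM": 395.7, "biased": 388.1, "meta": 401.0},
         6201: {"RL": 520.4, "LTM": 489.9, "biased": 495.3, "meta": 492.6}}
```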

Figure 2.


This second boxplot was an effort to determine how well each model type fits its preferred set of behavioral data. This might be redundant. The LTM model fits its data much better than the other groups (this might need a statistical test).

Figure 3.


Assessments of Model Fits

Looking at the learning curves for the four models in Figure 4, the differences in learning rates are apparent, as are other features such as the separation between the two set sizes. In the plot below, each data point is the average accuracy, for that number of stimulus presentations, across all parameter combinations. The LTM and RL models predict that an increase in set size does not diminish learning rate or accuracy. But this analysis washes out the individual differences that could be captured by the diverse set of parameter combinations.
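The averaging just described (mean accuracy at each presentation number, collapsed across runs or parameter combinations) can be sketched as follows, assuming trial data arrives as (presentation_number, correct) pairs:

```python
from collections import defaultdict

def mean_learning_curve(trials):
    """trials: iterable of (presentation_number, correct) pairs,
    pooled across runs. Returns {presentation_number: mean accuracy}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for n, correct in trials:
        sums[n] += correct
        counts[n] += 1
    return {n: sums[n] / counts[n] for n in sums}
```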

Figure 4.


The panels in Figure 5 show the mean accuracy for participant behavioral data. The model lines are averages across parameters for that group only. As we are aiming for an individual-differences view of these data, collapsing across so much of this variability is uninformative, as shown above in Figure 4, especially if the differences, once fit to actual behavioral data, indicate large differences in learning outcomes or cognitive-faculty diagnostics such as working memory capacity. Here, only the best-fitting sets of parameter combinations were selected and collapsed.

As can be seen in the figure below, the model types appear vastly different, and some characteristics of the behavioral data have come through, such as the separation of the learning trajectories for the different set sizes in the RL-LTM biased model fit. Some parameter sets in the LTM model also capture the difficulty associated with increasing set size (solid lines in Fig. 5B).

The LTM participants, on average, have the highest accuracies in the testing phase for both set sizes, but they are nearly indistinguishable from the meta-RL group in accuracy at the end of learning. The biased group shows the most separation between set sizes 3 and 6 during learning, and also lower accuracy at test than LTM. The biased group is negligibly different from the meta-RL group for set size 3 but shows a marked difference at set size 6, closely following the behavioral data.

Figure 5.


There are five outcome measures of interest in the RLWM task: accuracy at the end of learning, accuracy at test, learning rate (characterized as the number of stimulus presentations needed to reach 95% accuracy), the difference in learning between set size 3 and set size 6, and the level of preserved learning at test for both set sizes. The following analysis compares the model data with behavioral data.

#>    subjects  model     learnDiff
#> 1      6200 biased -1.041667e-01
#> 2      6201    LTM -5.208333e-02
#> 3      6202 biased -1.446759e-01
#> 4      6204    LTM -1.250000e-01
#> 5      6205    LTM -1.250000e-01
#> 6      6206 biased -4.861111e-02
#> 7      6207    LTM -5.439815e-02
#> 8      6209 biased -9.837963e-02
#> 9      6210    LTM  1.041667e-02
#> 10     6211    LTM  6.712963e-02
#> 11     6213 biased -1.250000e-01
#> 12     6214 biased -2.465278e-01
#> 13     6215    LTM -3.472222e-03
#> 14     6216    LTM -3.472222e-02
#> 15     6217    LTM -1.851852e-02
#> 16     6218 biased -2.337963e-01
#> 17     6219    LTM -3.472222e-02
#> 18     6220    LTM -2.662037e-02
#> 19     6223   meta -1.967593e-02
#> 20     6225    LTM -8.564815e-02
#> 21     6226    LTM -1.307870e-01
#> 22     6230 biased -1.180556e-01
#> 23     6231    LTM  4.629630e-02
#> 24     6234    LTM -1.122685e-01
#> 25     6235 biased -2.349537e-01
#> 26     6238    LTM -6.365741e-02
#> 27     6241   meta -7.407407e-02
#> 28     6242   meta -5.787037e-03
#> 29     6244 biased -2.233796e-01
#> 30     6245    LTM -7.060185e-02
#> 31     6246    LTM  3.009259e-02
#> 32     6247 biased -3.101852e-01
#> 33     6250    LTM -5.902778e-02
#> 34     6253    LTM -5.902778e-02
#> 35     6256    LTM -7.291667e-02
#> 36    15000    LTM -9.259259e-02
#> 37    15001    LTM -5.787037e-03
#> 38    15002 biased -2.256944e-01
#> 39    15003    LTM -7.638889e-02
#> 40    15004   meta -5.555556e-02
#> 41    15005 biased -2.557870e-01
#> 42    15006    LTM -1.412037e-01
#> 43    15007    LTM -1.342593e-01
#> 44    15008    LTM -7.291667e-02
#> 45    15009   meta  2.314815e-02
#> 46    15010    LTM -3.587963e-02
#> 47    15011    LTM  4.513889e-02
#> 48    15012    LTM -2.314815e-02
#> 49    15013    LTM -9.483115e-17
#> 50    15014    LTM -7.870370e-02
#> 51    15015   meta -1.446759e-01
#> 52    15016     RL -3.356481e-02
#> 53    15017 biased -2.118056e-01
#> 54    15019   meta -4.745370e-02
#> 55    15020    LTM -3.125000e-02
#> 56    15021   meta -2.314815e-02
#> 57    15022   meta  7.407407e-02
#> 58    15023    LTM -1.157407e-02
#> 59    28215 biased -7.870370e-02
#> 60    28241    LTM  8.680556e-02
#> 61    28242    LTM -1.967593e-02
#> 62    28243    LTM -4.166667e-02
#> 63    28284    LTM -1.504630e-01
#> 64    28303    LTM -6.712963e-02
#> 65    28306    LTM  1.041667e-02
#> 66    28307    LTM  4.861111e-02
#> 67    28308    LTM -2.546296e-02
#> 68    28309    LTM -1.099537e-01
#> 69    28325 biased -1.562500e-01
#> 70    28326    LTM -3.472222e-02
#> 71    28327    LTM -1.122685e-01
#> 72    28328    LTM -1.273148e-02
#> 73    28329    LTM -2.546296e-02
#> 74    28330    LTM  4.861111e-02
#> 75    28331    LTM -9.490741e-02
#> 76    29220    LTM -7.407407e-02
#> 77    29221 biased -2.164352e-01
#> 78    29227    LTM -1.006944e-01
#> 79    29239   meta  1.481481e-01
#> 80    29240    LTM -4.745370e-02
#> 81    29245    LTM  3.009259e-02
#> 82    29305 biased -1.817130e-01
#> 83    29318    LTM -8.564815e-02
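Two of the outcome measures above can be sketched in code. The learnDiff column is read here, hypothetically, as set-size-6 minus set-size-3 accuracy at the end of learning (so negative values mean set size 6 was learned less well), and learning rate as presentations-to-criterion; the report's exact operationalizations may differ:

```python
def learn_diff(acc_set3, acc_set6):
    """Hypothetical reading of learnDiff: negative values mean
    set size 6 ended learning at lower accuracy than set size 3."""
    return acc_set6 - acc_set3

def presentations_to_criterion(accuracies, criterion=0.95):
    """accuracies: mean accuracy at each presentation number, in order.
    Returns the first 1-based presentation index reaching the
    criterion, or None if it is never reached."""
    for i, acc in enumerate(accuracies, start=1):
        if acc >= criterion:
            return i
    return None
```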

It is difficult to assess what the model fits are capturing without examining the specific parameter sets more carefully, or without determining whether membership in a particular model group predicts some other cognitive or learning characteristics of the subjects. First, for the cohort of subjects

What are the differences in learning type in terms of behavioral outcomes in other tasks?

These plots (generated in Python) show group effects, for uCLIMB subjects only, on OLCTS measures and behavioral predictors.

We have 3-Back and PSS measures for a large majority of participants; what are the group differences, if any, in these outcomes based on model fit?

Chantel’s request: combine language and programming measures and compare groups.

These plots show that in the biased model, most subjects use RL a very low percentage of the time. However, higher rates of RL use, or a more even split between RL and LTM, indicate a separation between set size 3 and set size 6 learning accuracy.

If that is the case, is the inclusion of the RL component, however small, a vital part of their learning make-up? This plot shows what this group would have looked like if it had relied only on LTM.

Parameters

Parameter summary: what is the spread of the parameters across participants in the models?

How about some K-means clustering?
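If K-means is pursued, a minimal pure-Python sketch over per-participant parameter vectors might look like this (illustrative only; an off-the-shelf implementation such as scikit-learn's would be used in practice):

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Tiny k-means for small parameter tables.
    points: list of equal-length tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # assignment step: nearest centroid by squared distance
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # update step: each centroid becomes the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return centroids, labels
```

Clustering the fitted parameter sets this way could reveal whether participants fall into a small number of learning "profiles" rather than varying continuously.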

Some specific plans are to estimate the three LTM parameters for all 83 participants and see whether they are related to WM and PSS measures. Also, how are the parameters related to the “separation” between set sizes 3 and 6?

Another specific thing to test might be the effect of delay between stimulus presentations.

Individual plots: